An Optimized Sanitization Approach for Minable Data Publication
Authors
Abstract
Minable data publication is ubiquitous since it is beneficial to sharing/trading data among commercial companies and further facilitates the development of data-driven tasks. Unfortunately, minable data publication is often implemented by publishers with limited privacy concerns, such that the published dataset is minable by malicious entities. This hinders minable data publication, since the published data may contain sensitive information. Thus, approaches and technologies for reducing privacy leakage risks are urgently demanded. To this end, in this paper, we propose an optimized sanitization approach for minable data publication (named SA-MDP). SA-MDP supports the association rules mining function while providing privacy protection for specific rules. In SA-MDP, we consider the trade-off between data utility and data privacy in the minable data publication problem. To address this problem, SA-MDP designs a customized particle swarm optimization (PSO) algorithm, where the optimization objective is determined by both data utility and data privacy. Specifically, we take advantage of PSO to produce new particles, which is achieved by random mutation or learning from the best particle. Hence, SA-MDP can avoid solutions being trapped in local optima. Besides, we design a proper fitness function to guide the particles towards the optimal solution. Additionally, we present a preprocessing method before the evolution process of the customized PSO algorithm to improve the convergence rate. Finally, the proposed SA-MDP is performed and verified over several datasets. The experimental results have demonstrated the effectiveness and efficiency of SA-MDP.
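The abstract gives only a high-level description of the customized PSO, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a single sensitive itemset, a particle encoded as one bit per candidate transaction (1 means "delete the victim item"), a fitness equal to a privacy penalty plus the number of edited transactions, and a preprocessing step that keeps only transactions supporting the sensitive itemset. All function names, parameters, and weights (`sanitize`, `victim`, `w_privacy`, `p_mutate`, `p_learn`) are hypothetical.

```python
import random


def support(transactions, itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apply_particle(transactions, candidates, bits, victim):
    """Delete `victim` from each candidate transaction whose bit is 1."""
    out = list(transactions)
    for idx, bit in zip(candidates, bits):
        if bit:
            out[idx] = out[idx] - {victim}
    return out


def fitness(transactions, candidates, bits, sensitive, victim, min_sup,
            w_privacy=10.0):
    """Lower is better: privacy penalty plus a crude utility loss.

    Assumption: utility loss is approximated by the number of edited
    transactions; the paper's actual fitness function is not given here.
    """
    sanitized = apply_particle(transactions, candidates, bits, victim)
    exposed = support(sanitized, sensitive) >= min_sup
    return w_privacy * exposed + sum(bits)


def sanitize(transactions, sensitive, victim, min_sup, n_particles=20,
             n_iter=50, p_mutate=0.1, p_learn=0.5, seed=0):
    """Hide `sensitive` by deleting `victim` (an item of `sensitive`)."""
    rng = random.Random(seed)
    # Assumed preprocessing: only transactions that support the sensitive
    # itemset can change its support, so the search space is limited to them.
    candidates = [i for i, t in enumerate(transactions) if sensitive <= t]

    def score(bits):
        return fitness(transactions, candidates, bits, sensitive, victim,
                       min_sup)

    swarm = [[rng.randint(0, 1) for _ in candidates]
             for _ in range(n_particles)]
    best = min(swarm, key=score)[:]
    for _ in range(n_iter):
        for particle in swarm:
            for j in range(len(particle)):
                r = rng.random()
                if r < p_mutate:
                    particle[j] ^= 1        # random mutation
                elif r < p_mutate + p_learn:
                    particle[j] = best[j]   # learn from the best particle
            if score(particle) < score(best):
                best = particle[:]
    return apply_particle(transactions, candidates, best, victim), best
```

For example, calling `sanitize(data, frozenset("ab"), "b", 0.5)` on a list of `frozenset` transactions returns the sanitized list together with the best bit vector. The privacy weight is chosen larger than the maximum possible edit count in small examples, so any particle that hides the itemset outranks any particle that does not; this weighting is an illustrative choice, not one taken from the paper.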
Similar resources
Development and implementation of an optimized control strategy for induction machine in an electric vehicle
In the area of automotive engineering there is a tendency towards more electrification of the power train. In this work, control of an induction machine for electric vehicle applications is investigated. As the operating point of the machine changes, adapting the rotor magnetization current seems to be useful to increase the machine's efficiency. In the literature there are many approaches wh...
A new approach to credibility premium for zero-inflated Poisson models for panel data
The main goal of this research is to derive and compare the credibility premium in count models with unreported data for longitudinal data. In this research, predictive premiums are computed based on the squared-error and exponential loss functions and compared with each other. The desire to receive a bonus or reward is one of the important reasons for not reporting accidents, and individuals often refrain from reporting low-cost accidents in order to benefit from the discount. In this research ...
Minable Data Warehouse
Data warehouses have been widely used in various capacities such as large corporations or public institutions. These systems contain large and rich datasets that are often used by several data mining techniques to discover interesting patterns. However, before data mining techniques can be applied to data warehouses, arduous and convoluted preprocessing techniques must be completed. Thus, we pr...
An information retrieval approach to document sanitization
In this paper we use information retrieval metrics to evaluate the effect of a document sanitization process, measuring information loss and risk of disclosure. In order to sanitize the documents we have developed a semiautomatic anonymization process following the guidelines of Executive Order 13526 (2009) of the US Administration. It embodies two main and independent steps: (i) identifying an...
Journal
Journal title: Big Data Mining and Analytics
Year: 2022
ISSN: 2096-0654
DOI: https://doi.org/10.26599/bdma.2022.9020007